ABSTRACT



In 1998, the Keystone Research Center published a 
comprehensive review of research on two high-profile ideas for raising 
educational achievement: lowering class size in the early grades and 
instituting private school vouchers. This report presents a research update, 
emphasizing results since early 1998. Over the past 2 years, the evidence on 
the achievement benefits of lowering class size in grades K-3 has grown 
stronger. The report discusses the Tennessee Student-Teacher Achievement 
Ratio program and the Wisconsin Student Achievement Guarantee in Education 
program. Evidence that private school vouchers raise student achievement 
remains weak. This report discusses two voucher programs: the Milwaukee 
Parental Choice Program and the Cleveland Scholarship and Tutoring Program. 
It also discusses implications of reducing class size for 
Pennsylvania. (Contains 85 references.) (DFR) 
Keystone Research Center

The Keystone Research Center (KRC), a non-partisan think tank with offices in Harrisburg and the Philadelphia area, 
conducts research on the Pennsylvania economy and civic institutions. This research documents current conditions and 
seeks to develop innovative policy proposals to expand economic opportunity and ensure that all state residents share in 
the benefits of economic growth. 

The Keystone Research Center is a non-profit organization as described in section 501(c)(3) of the Internal Revenue 
Code. All contributions are tax deductible. 



About the Author 

Alex Molnar holds a Ph.D. in Urban Education and since 1972 has been a professor in the School of Education at the 
University of Wisconsin-Milwaukee, where he directs the Center for Education Research, Analysis, and Innovation. 
Previously, he taught high school social studies in the Chicago area. From 1993-95, Professor Molnar served as chief of 
staff for the Wisconsin Department of Public Instruction Urban Initiative. He is currently a member of the team conducting 
the legislatively mandated evaluation of the Wisconsin class-size initiative, the Student Achievement Guarantee in 
Education (SAGE) program. Professor Molnar has edited, written, or co-authored several books, including Changing 
Problem Behavior in Schools (1989), Giving Kids the Business (1996), and The Construction of Children’s Character 
(1997). Professor Molnar consults extensively on educational policy and practice issues throughout the United States. 








SMALLER CLASSES AND EDUCATIONAL VOUCHERS


TABLE OF CONTENTS

EXECUTIVE SUMMARY
  The Achievement Evidence on Smaller K-3 Classes
  The Achievement Evidence on Educational Vouchers
  Implications for Pennsylvania

INTRODUCTION

EDUCATIONAL VOUCHERS: A RESEARCH UPDATE
  The Argument Over Vouchers
  The Milwaukee Parental Choice Voucher Program
  The Achievement Effects of the Milwaukee Voucher Program
  The Cleveland Scholarship and Tutoring Program (CSTP)
  Official Evaluation Results for the Cleveland Scholarship and Tutoring Program
  Private Voucher Programs
  Milwaukee: Partners Advancing Values In Education (PAVE)
  Indianapolis: The Educational Choice Charitable Trust
  The New York School Choice Program
  The Washington (D.C.) Scholarship Fund
  Parents Advancing Choice in Education (Dayton, Ohio)
  San Antonio Private Voucher Programs
  Vouchers and Educational Equity

SMALLER CLASSES: A RESEARCH UPDATE
  The Recent History of Class-Size Research
  The Tennessee Student-Teacher Achievement Ratio (STAR) Study
  Are the Benefits of Smaller Classes Cumulative and Do the Benefits Last?
  Project Challenge
  Wisconsin: Student Achievement Guarantee in Education (SAGE)
  California

CONCLUSION

IMPLICATIONS FOR PENNSYLVANIA

REFERENCES






EXECUTIVE SUMMARY 



In January 1998, the Keystone Re- 
search Center published a comprehensive 
review of research on two high-profile ideas 
for raising educational achievement: lowering 
class size in the early grades and instituting 
private school vouchers. 1 In the context of 
continuing debate about these alternatives, this 
report presents a research update, emphasizing 
results since early 1998. 

The Achievement Evidence 
on Smaller K-3 Classes 

Over the past two years, the evidence 
on the achievement benefits of lowering class 
size in grades K-3 has grown stronger. 

The Tennessee Student-Teacher 
Achievement Ratio (STAR) program. The 
new evidence on class size includes additional 
analyses of data from the Tennessee STAR 
program. Initiated in the mid-1980s, STAR 
was a genuine scientific experiment. K-3 
students in 79 schools were randomly assigned 
to small classes (about 13-17 students), regu- 
lar classes (22-25 students), or regular classes 
with a teacher’s aide. As reported in 1998, 
students in smaller classes achieved signifi- 
cantly higher test scores on average than 
students in regular classes or regular classes 
with a teacher’s aide. The largest gains were 
achieved in inner-city small classes. 

In the last two years, important new 
findings have emerged from the Tennessee 
STAR experiment. 



• The advantages of having attended 
small classes increased as children 
reached higher grades. In grade four, 
students who attended small classes 
throughout K-3 were 6-9 months ahead of 
regular class students in math, reading, 
and science. By grade eight, these advan- 
tages grew to just over one year. 

• Stronger evidence now exists that the 
benefits of smaller classes are cumulative. 
The more years students spent in small 
classes in K-3, the greater were the long- 
term achievement benefits. 

• Students who attended small classes in 
Tennessee took college entrance 
exams at significantly higher rates than 
their peers who attended regular-size 
classes (Figure 1). In a sample of 9,397 
STAR students who were high school 
seniors in 1997-98, almost 44 percent of 
those who attended small classes took 
college entrance exams. This compared to 
40 percent of those who attended regular 
classes. For African-American students, the 
corresponding figures were 40.2 percent and 
31.7 percent, respectively. Attending a small 
class reduced the white-black gap in the 
share of students who took college entrance 
tests by 54 percent. 

• More small-class students graduated from 
high school on schedule. In a sample of 
2,857 STAR students, 72 percent of small- 
class participants graduated from high school 
on schedule, compared to 65-66 percent of 
regular class participants. 
Figure 1.

Tennessee STAR Students Assigned to Small Classes 
Were More Likely to Take College Entrance Exams 
Source: Alan Krueger and Diane Whitmore, “The Effects of Attending a Small Class in the Early Grades on College Attendance Plans,” unpublished 
paper, Princeton University, April 9, 1998. 

Notes: Figure shows percent of students who took either the ACT or the SAT exam, by their initial class-size assignment. Sample consists of 9,397 STAR 
students who were high school seniors in 1998. Free lunch group includes students who ever received free or reduced-price lunch in grades K-3. 



Figure 2. 

Increase in First Grade Test Scores in Wisconsin, 
Small vs. Regular Classes 
Source: Alex Molnar, Philip Smith, and John Zahorik, 1997-98 Evaluation Results of the Student Achievement Guarantee in Education (SAGE) 
Program (Milwaukee: School of Education, University of Wisconsin-Milwaukee, December 1998), Tables 17 and 29. 
The Wisconsin Student Achievement 
Guarantee in Education (SAGE) program. The 
SAGE program reduced the student-teacher 
classroom ratio to 15:1 in 30 schools, begin- 
ning with kindergarten and first grade in 1996- 
97 and adding second and third grade in the 
next two years. The performance of students 
in SAGE schools has been evaluated against a 
comparison group of 14-17 schools (the 
number depending on the year) with similar 
demographic and socioeconomic characteris- 
tics and similar test scores prior to SAGE. 

• After two years, the impact of reduced 
class size in Wisconsin’s SAGE program 
appears to be similar to the impact of 
smaller classes in Tennessee. 

• In both 1996-97 and 1997-98, students in 
small first-grade classes achieved bigger 
increases in test scores in language arts, 
reading, and mathematics (Figure 2). The 
advantage observed in small first-grade 
classes in 1996-97 was maintained 
in small second-grade classes. 

• From fall 1997 to spring 1998, first-grade 
African-American students in small 
classes reduced the achievement gap with 
white students by 19 percent. (In comparison 
schools, the white-black achievement 
gap grew 58 percent.) 

• In 1997-98, student achievement in SAGE 
first-grade classes with one teacher and 15 
students was not significantly different 
from achievement in classes with two 
teachers and 30 students. This suggests 
that school districts may not need to 
construct new schools and classrooms to 
achieve the benefits of smaller classes. 



The Achievement Evidence on 
Educational Vouchers 

Evidence that private school vouchers 
raise student achievement remains weak. 

• The Milwaukee Parental Choice Program. 
No recent results are available from the 
nation’s first taxpayer-financed voucher 
program because a 1995 legislated expansion 
eliminated the evaluation requirement. 
As reported in 1998, conflicting 
results emerged from three different teams 
of researchers who analyzed the Milwaukee 
program based on data from 1990 to 
1995. 

• The Cleveland Scholarship and Tutoring 
Program. The official evaluation of the 
Cleveland voucher program, the nation’s 
second publicly financed voucher program, 
found no significant difference 
between third-grade voucher students and 
public school students in 1996-97. A 
second research team re-examined these 
data and found gains for voucher students 
in language and science but not in reading, 
math, or social studies. These positive 
findings hinge on two controversial 
methodological choices, including the use of a 
lower than conventional threshold for 
statistically significant results. 

• In 1997-98, the official evaluation found 
that fourth-grade Cleveland voucher 
students achieved better than their public 
school counterparts in language, but not 
significantly differently in four other 
subjects, including reading and math, if 
statistical controls are included for class 
size, teacher experience, and teacher’s 
educational background. 
• The evaluation of the Cleveland program 
in 1997-98 also found that student perfor- 
mance in new private schools is signifi- 
cantly worse than student performance in 
public schools. 

Private voucher programs. Several 
new private voucher programs have been 
established in recent years. A New York 
voucher program has generated fragile 
evidence of positive effects of vouchers for 
older students in elementary school. Several 
other private voucher programs are conduct- 
ing evaluations that will produce results in 
the next several years. 

A general problem with small-scale 
voucher experiments is that they tell us little 
about the impact of large-scale programs. 
When small numbers of low-income students 
are placed in established private schools, 
these students often benefit from “peer 
effects.” That is, they benefit from attending 
school with students who come from rela- 
tively more affluent families, have relatively 
more educated parents, or have parents who 
are more actively involved in their education. 

In larger-scale experiments, new 
private schools must be established and 
existing ones substantially expanded. Peer 
effects may be similar to those in public 
schools. Differing peer effects may explain 
why voucher students in new private schools 
in Cleveland performed worse than students 
in public schools, while voucher students in 
established private schools performed better. 



Implications for Pennsylvania 

As the results of the Tennessee class- 
size experiments and the Wisconsin SAGE 
evaluation have become more widely known, 
reducing class size has become a favorite of 
state and federal legislators, as well as parents, 
across the country. In California, smaller 
classes have been introduced so rapidly and on 
so large a scale that the achievement benefits 
and the cost-effectiveness of the reform may be 
reduced. 

Pennsylvania has a rare opportunity to 
introduce a class-size reduction program 
targeted on the areas in which it would generate 
the greatest benefits and designed scientifically 
to generate knowledge of how to improve 
educational achievement in a cost-effective 
manner. Such a SMART (Scientific Methods, 
Achieving Results Today) class-size program 
should begin by reducing class size in kinder- 
garten and first grade. As in Wisconsin, prior- 
ity should be placed on lowering class size in 
schools that serve high proportions of low- 
income students. Smaller classes should be 
introduced in the rest of the state on an experi- 
mental basis, as should smaller classes in 
second and third grade. Building on 
Wisconsin’s experience, Pennsylvania should 
evaluate the benefits of combining class-size 
reductions with other (e.g., curricular and 
teacher training) innovations. 

As SMART class-size program students 
progress through higher grades, Pennsylvania 
should track social indicators of well-being as 
well as achievement test scores. In Wisconsin, 
the interest in smaller classes that led to the 
SAGE program stemmed from their potential 
social as well as achievement benefits. A 
statewide Urban Initiative task force (which 
included bipartisan legislative and business 
leaders) believed that smaller K-3 classes 
might reduce youth violence by increasing the 
chance that children entering school would 
find an adult who knows and cares for them. 

The Tennessee STAR experiment 
represents not just a shining example of 
scientific educational research but also an 
inspiring illustration of bipartisan politics at 
its best. STAR was the result of a compromise 
between legislators who wanted widespread 
class-size reductions and those who consid- 
ered them too expensive given the quality of 
the evidence on their benefits. 

Pennsylvania now has a chance to 
achieve a similarly historic advance. It can 
invest in high-payoff class-size reduction for 
low-income students while conducting sys- 
tematic analysis of what additional invest- 
ments would make the most sense. A dozen 
years from now, a SMART class-size program 
could win for Pennsylvania the kind of recog- 
nition now accorded the Tennessee STAR 
experiment. 
Voucher and Class Size Resources on the World Wide Web 

Vouchers 

Milwaukee Parental Choice Program: http://www.dpi.state.wi.us/dfrn/sms/choice.html 

Information about the Milwaukee Parental Choice Program can be found on the Wisconsin DPI homepage. 

Program on Educational Policy and Governance: http://data.fas.harvard.edu/pepg/ 

Voucher program evaluations by Professor Paul Peterson and co-authors can be found by following the “Papers” link. 

Partners Advancing Values in Education (PAVE): http://www.pave.org 

Homepage of the PAVE program. For other information on PAVE, see http://www.ceoamerica.org/info/Milwaukee.html. 

Indiana Center for Evaluation: http://www.indiana.edu/~iuice/ 

Homepage of the Center that conducted the state-funded evaluation of the Cleveland Scholarship and Tutoring Program. 

School Choice 1999: What’s Happening in the States, by Nina Shokraii Rees and Sarah E. Youssef: 
http://www.heritage.org/schools/ 

An annual state-by-state report on voucher and charter school developments found on the Heritage Foundation homepage. 

Parents Advancing Choice in Education (PACE): http://www.ceoamerica.org/info/Dayton.html 
An information page about PACE, the privately funded Dayton, Ohio, voucher program. 

Educational Choice Charitable Trust: http://www.ceoamerica.org/info/Indianapolis.html 
An information page about Indianapolis’ privately funded voucher program. 

Washington Scholarship Fund: http://www.wsf-dc.org 

The homepage of the privately funded voucher program in the District of Columbia. 

CEO America: http://www.ceoamerica.org 

Homepage of CEO America, an umbrella group that supports privately funded voucher programs around the country. 

CEO Horizon-Edgewood Program: http://www.ceoamerica.org/horizon-news.html 
Information about CEO America’s Horizon-Edgewood voucher program in San Antonio, Texas. 

Mathematica Policy Research, Incorporated: www.mathmatica-mpr.com 

The website of Mathematica, a company that evaluates voucher programs, including the New York program. 

Class Size 

“The Evidence on Class Size,” Eric A. Hanushek: http://www.edexcellence.net/library/size.html 
Hanushek’s 1998 analysis of the various class-size reduction programs around the country. 

Student Achievement Guarantee in Education Evaluation Project: http://www.uwm.edu/SOE/centersprojects/sage/ 

The website of the evaluation of Wisconsin’s class-size reduction program. 

HEROS (Health and Education Research Operative Services, Inc.): http://www.telalink.net/~heros 

The HEROS site includes links to many studies and commentaries on the Tennessee STAR class-size reduction program. 

California Class Size Reduction Program: http://www.cde.ca.gov/ftpbranch/sfpdiv/classize/ 

The website for California’s CSR program, on the California Department of Education homepage. 

WestEd: http://www.wested.org/ 

WestEd is the regional research laboratory serving Arizona, California, Nevada, and Utah. See http://www.wested.org/policy/ 
pubs/full_text/class_size/ for WestEd’s 1998 evaluation of California’s class-size reduction. 
INTRODUCTION 



In January 1998, the Keystone 
Research Center published Smaller Classes - 
Not Vouchers - Increase Student Achievement, 
a comprehensive synthesis of research on the 
achievement consequences of instituting 
private school vouchers and of reducing class 
size in the early grades. Nationally and in 
Pennsylvania, vouchers and smaller classes 
have remained the focus of intense interest 
during the past 15 months. Important new 
research findings have been published on the 
effect of vouchers and smaller classes. For 
policymakers and the general public, this 
update summarizes the latest evidence, plac- 
ing it in the context of key findings from 
earlier research. Although this report is self- 
contained, readers interested in comprehen- 
sive evaluations of earlier research may wish 
to read this document in combination with the 
earlier Keystone report. 



What Are These Researchers Talking About? A Glossary of Terms 

CSTP: The Cleveland Scholarship and Tutoring Program, the official name of the Cleveland voucher program. 

Control (as in “control for” and “control group”): To evaluate the impact of a voucher program or smaller class size 
on student achievement, analysts need to isolate their impacts from those of other variables (such as a student’s family 
background). This can be done by comparing the performance of the students who get vouchers or attend small classes 
with the performance of another group of students, a “control group,” that is as similar as possible except for not 
having received vouchers or attended a small class. In statistical analysis, researchers usually take explicit account of 
(or “control for”) family and individual differences, so that the impact of vouchers or class size will not be incorrectly 
estimated. 

Effect size: To evaluate the benefits of vouchers or smaller class sizes, you need to know how big an impact they have 
on student achievement. Effect sizes gauge this impact by looking at the gap in test scores between students who receive 
vouchers or attend small classes and students who don’t. This gap is divided by a measure of the overall spread of student 
scores. (See standard deviation.) 

Meta-analysis: When a large number of studies has been conducted on a subject — such as the achievement impact of 
small classes — a systematic evaluation, or meta-analysis, of these prior studies may be used as a tool for determining the 
overall weight of the evidence. In weighing the importance of each study, the meta-analysis takes into account such 
factors as its sample size and the quality of the research methods used. 

MPCP: Milwaukee Parental Choice Program, the official name of the Milwaukee voucher program. 

Percentile ranks: To evaluate the benefits of vouchers or smaller class sizes, you need to know how big an impact they 
have on student achievement. One way to do this is to consider how much an improvement in test scores would move a 
student up in the overall student ranking. If an improvement would move a student up from, say, the mid-point of the 
achievement curve (the 50th percentile) past another 10 percent of students (to the 60th percentile), it would be said to 
have improved scores by 10 percentile ranks. 

Standard deviation: A measure of how spread-out a group of numbers (such as student test scores) is. It equals the 
square root of the average squared difference between each test score and the average test score. 

Statistical significance: In evaluating the impact of vouchers or class size on test scores (or of any variable on another 
variable), researchers want to know whether they can be confident that an observed performance difference is too large 
to be plausibly explained by random chance. If a difference that large would occur by chance only with a small 
probability (“small” being defined customarily as 5 times out of 100), then the observed change in 
performance is considered to be statistically significant. 
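
For readers who find a calculation easier to follow than a definition, the glossary entries for standard deviation, effect size, and percentile ranks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the test scores are invented, the function names are hypothetical, the effect size divides by the spread of all scores combined (studies differ on which spread they use), and the percentile conversion assumes scores follow a normal (bell-shaped) distribution.

```python
from statistics import NormalDist, mean


def standard_deviation(scores):
    """Square root of the average squared difference between each score and the average."""
    avg = mean(scores)
    return (sum((s - avg) ** 2 for s in scores) / len(scores)) ** 0.5


def effect_size(treated, comparison):
    """Gap in average scores divided by the overall spread of scores.

    Assumption: "overall spread" is taken as the standard deviation of
    both groups combined; other studies use other denominators.
    """
    gap = mean(treated) - mean(comparison)
    return gap / standard_deviation(list(treated) + list(comparison))


def percentile_after_gain(effect, start_percentile=50.0):
    """Percentile reached after gaining `effect` standard deviations.

    Assumption: scores are normally distributed. start_percentile must be
    strictly between 0 and 100.
    """
    z = NormalDist().inv_cdf(start_percentile / 100.0)
    return 100.0 * NormalDist().cdf(z + effect)


# Hypothetical scores, invented purely for illustration (not STAR or SAGE data).
small_class = [52, 55, 61, 58, 64]
regular_class = [48, 51, 57, 54, 60]

d = effect_size(small_class, regular_class)
print("effect size:", round(d, 2))
print("percentile reached from the 50th:", round(percentile_after_gain(d), 1))
```

The last line expresses the same gain as a movement in percentile ranks, which is how several of the studies discussed in this report present their results.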
EDUCATIONAL VOUCHERS: A RESEARCH UPDATE 



The Argument Over Vouchers 

Proponents of vouchers tend to base 
their position on three widely held beliefs 
about public education: 

1. that educational outcomes have deteriorated, 

2. that American public education costs have 
accelerated unreasonably, and 

3. that the public schools cannot reform 
themselves because of bureaucratic and 
political constraints. 

Each of these beliefs is subject to 
serious challenge. There is considerable 
evidence that educational outcomes have 
actually improved over the last 20 years. A 
1993 report written by scientists at the Sandia 
National Laboratories found that U.S. public 
education performance was improving. 2 
Between the 1970s and 1990, according to a 
1994 RAND study, reading and math scores 
rose significantly for Hispanics and 
African-Americans. 3 In a March 1998 article, 
Princeton University economist Alan Krueger 
reported that National Assessment of Educational 
Progress (NAEP) exams reveal rising 
American public school performance over the 
past 20 years. 4 For example, a student scoring 
in the 50th percentile today performs as well 
as the 56th-percentile student 25 years ago. 5 
The most disadvantaged students have made 
the greatest gains. Moreover, between the 
early 1970s and 1990, the black-white NAEP 
test-score gap for 17-year-olds decreased by 
almost half (before increasing slightly in the 
1990s). 6 

Contrary to the second widely held 
perception driving support for vouchers, 
Richard Rothstein found that resources for 
regular classrooms at public schools have 
increased only modestly over the last several 
decades. 7 Rothstein reached this conclusion 
by identifying expenditures on special educa- 
tion, transportation, and other activities out- 
side the regular classroom. In a survey of nine 
school districts, he found that inflation- 
adjusted per-pupil spending for regular education 
rose by only 28 percent from 1967 to 
1991. In Los Angeles, inflation-adjusted per- 
pupil spending on regular education declined 
by 3.5 percent over the same period. If this 
decline in spending for regular education 
typifies developments in urban areas, it may 
help explain worsening relative academic 
outcomes in some urban public schools. 
Rothstein’s research also suggests that care- 
fully targeted increases in spending on regular 
classroom instruction in urban areas may 
increase both parental satisfaction and student 
achievement. 

Of course, national statistics about 
gradually improving performance and the 
stagnation of funds flowing to regular class- 
rooms in urban school districts are of little 
comfort to parents convinced that their own 
children will not get the lift they need from the 
local public school. 

Parents who want better schools for 
their kids now have been a receptive audience 
for the third widely held belief that underlies 
support for vouchers today: that public schools 
are incapable of reforming themselves because 
of bureaucratic and political constraints. This 
argument gained intellectual legitimacy with 
the 1990 publication of Politics, Markets, and 
America’s Schools by John Chubb and Terry 
Moe. 8 In their book, Chubb and Moe argued 
that private school vouchers are needed 
because private schools exhibit superior 
academic performance and because public 
school performance has not improved despite 
reforms instituted during the 1980s. 9 

Chubb and Moe’s claims notwithstand- 
ing, the research literature contains no clear 
evidence that private schools are better than 
public schools. Moreover, since most of the 
studies in the literature on public versus 
private schools use data for secondary schools, 
they are of limited value in predicting the 
impact of voucher programs that, for the most 
part, involve private elementary schools. 10 

Many proponents of private school 
vouchers, such as Wisconsin Assembly mem- 
ber Annette “Polly” Williams, author of the 
Milwaukee Parental Choice Program legisla- 
tion, link vouchers to their desire to empower 
poor families and raise the academic achieve- 
ment of poor children. They argue that vouch- 
ers may improve achievement by forcing the 
public schools to compete in an educational 
marketplace in which poor parents hold the 
power of the purse. What does the research 
evidence show? 

The Milwaukee Parental Choice 
Voucher Program 

Private school vouchers have been 
debated at the state level for over 20 years. 
However, voucher legislation has become law 
in only three states: Wisconsin (1990), Ohio 
(1995), and now Florida (1999). 

Wisconsin established the country’s 
first publicly funded private school voucher 
program in Milwaukee. Today, the Milwaukee 
Parental Choice Program (MPCP) is the 
voucher program for which the greatest 
volume of systematic data is available. 



The MPCP initially allowed up to 1 
percent of low-income Milwaukee Public 
School students (about 1,000 students) to 
attend participating private, non-sectarian 
schools within the city (Table 1). The pro- 
gram defined “low-income” as below 175 
percent of the official U.S. poverty line. Each 
child attending a private school in the program 
receives a voucher worth the per-pupil equal- 
ized state aid to the Milwaukee Public 
Schools, originally set at $2,446 and currently 
$4,894 (in 1998-99). The Wisconsin legisla- 
tion that created Milwaukee’s Choice program 
provided for yearly evaluations of the aca- 
demic achievement of students attending 
Choice schools. 

In 1993, the Milwaukee Parental Choice 
Program was modified to raise (effective 1994- 
95) the number of students who could participate 
from 1 percent to 1.5 percent of the Milwaukee 
Public School population (i.e., to about 1,500 
students). A 1995 change allowed religious 
schools to participate in the MPCP and raised 
the eligibility ceiling to 7 percent of the Milwau- 
kee Public School enrollment in 1995-96 and 15 
percent in 1996-97. 

The 1995 revision of the MPCP, deemed 
constitutional by the Wisconsin Supreme Court 
on June 10, 1998, does not require that the 
schools participating in the program gather the 
achievement data necessary for a comprehensive 
evaluation. Because the necessary data are 
unavailable, no evaluation of the achievement 
impact of the program since 1995 has been 
conducted. Although the Wisconsin Legislative 
Audit Bureau is required to issue a report in the 
year 2000, no meaningful evaluation of the 
achievement impact of the program since 1995 is 
likely in the future. 
Table 1.

Milwaukee Parental Choice Program Profile 1990-99

School       Number of   Number of    Average Number   Voucher   Total Cost     Annual
Year         Schools     Applicants   of Voucher       Amount    of Vouchers    Attrition
                                      Students*                  (millions      Rate
                                                                 of dollars)    (percent)

1990-1991    7           577          300              $2,446    $0.73          46%
1991-1992    6           689          512              $2,643    $1.35          35%
1992-1993    11          998          594              $2,745    $1.63          31%
1993-1994    12          1,049        704              $2,985    $2.10          27%
1994-1995    12          1,046        771              $3,209    $2.47          28%
1995-1996    17          ..           1,288            $3,667    $4.61
1996-1997    20          ..           1,616            $4,373    $7.07
1997-1998    23          ..           1,497            $4,696    $7.03
1998-1999    88†         ..           5,806**          $4,894    $28.41**

.. Information not available.

* Calculated as the average of September and January memberships, plus summer school membership.

** Estimate.

† There are three schools within one organization, Seeds of Health.

Sources: State of Wisconsin Department of Public Instruction web page, http://www.dpi.state.wi.us/dpi/dfm/sms/histmem.html; John F. Witte, Troy D.
Sterr, and Christopher A. Thorn, Fifth-Year Report: Milwaukee Parental Choice Program (Madison, WI: Robert M. La Follette



The Achievement Effects of The Milwaukee 
Voucher Program 

The 1998 Keystone report contained an 
extended discussion, summarized only briefly 
below, of the findings of research on the 
Milwaukee voucher program. 11 Since the 
release of that report, no new research has 
been published on the program (although the 
head of the official evaluation team, John 
Witte, did publish a new synthesis of his prior 
work). 12 

In considering the Milwaukee voucher 
program’s achievement effects, four features 
should be kept in mind that make the program 
difficult to evaluate. 

1. During each of the evaluation years (1990-95), 
the program enrolled fewer than 800 students 
(Table 1). 



2. The parents of the 300-800 students in the 
program during the evaluation years had more 
education and higher academic expectations 
than the parents of most of the other 60,000 
eligible Milwaukee Public School students. It 
is possible that students of parents with more 
education and higher expectations would 
achieve faster whether in public schools or 
voucher schools. 

3. More than 80 percent of Milwaukee 
voucher students in the evaluation years 
attended three schools with established reputa- 
tions. At best, the Milwaukee voucher experi- 
ment tells us something about how these 
particular private schools compare with the 
Milwaukee public schools as a group. It 
indicates nothing about the impact of a larger- 
scale voucher program in which some students 
attend new private schools. 
Keeping these program characteristics 
in mind, the following conclusions about the 
achievement consequences of the MPCP can 
be drawn from the results of the three research 
teams that analyzed the Milwaukee data. 

1. Disagreement exists about whether the 
voucher program generates positive achieve- 
ment outcomes compared to the Milwaukee 
Public School system. Two of three research 
teams, including the methodologically most 
sophisticated (Cecilia Rouse of Princeton 
University), found no positive outcomes for 
the voucher students in reading. Two of three 
research teams, including Rouse, found 
positive outcomes for voucher students in 
math. 

2. Rouse found that a group of Milwaukee 
public schools that have small classes and 
serve low-income students perform as well as 
voucher schools in math and better than 
voucher schools in reading. Rouse also dis- 
covered that voucher schools appear to have 
smaller classes than any of three sub-groups of 
Milwaukee public schools. Thus, any 
achievement benefit of voucher schools 
compared to the Milwaukee Public School 
system overall may be a result of smaller 
classes rather than any inherent advantage of 
private over public schools. 

Rouse’s final word on the Milwaukee 
voucher program is: 

If we really want to “fix” our educa- 
tional system, we need a better under- 
standing of what makes a school 
successful, and we should not simply 
assume that market forces explain 
sectoral [i.e., public-private] differences 
and are therefore the magic 
solution for public education. 13 

The Cleveland Scholarship and 
Tutoring Program (CSTP) 

Ohio enacted the Cleveland Scholar- 
ship and Tutoring Program (CSTP) legislation 
in March 1995 (Table 2 profiles the program). 14 
The CSTP legislation allowed the 
Ohio Superintendent of Public Instruction to 
create a pilot voucher program in Cleveland. 
The Cleveland program is largely supported 
by money from Ohio’s Disadvantaged Pupil 
Impact Aid Program, previously earmarked for 
the Cleveland Public Schools. 

Scholarship recipients are selected by 
lottery with priority going to applicants whose 
family income is less than the Federal poverty 
level. Second priority goes to families whose 
income is less than twice the poverty level. 
There is no income cap on participation. 

The approximately 30,000 K-3 stu- 
dents who reside within the Cleveland School 
District are eligible to apply to the program. 
Once admitted to the program, students may 
receive scholarships through eighth grade. 

Since the Cleveland voucher program 
allows religious schools to participate, its 
constitutionality was immediately challenged. 
On July 31, 1996, the Franklin County Court 
of Common Pleas held the program constitu- 
tional and allowed it to be implemented. On 
May 1, 1997, an Ohio appeals court ruled the 
program unconstitutional. The Ohio Supreme 
Court allowed the program to go forward 
while it considers an appeal. It has not yet 
issued a ruling. 
Table 2.

Cleveland Scholarship and Tutoring Program Profile 1996-99

School Year  Number of  Number of     Number of         Average Value  Total Cost of Vouchers  Annual Attrition
             Schools    Applications  Voucher Students  of Voucher     (millions)              Rate

1996-1997    56         6,244         1,994             $1,750         $3.18                   17%
1997-1998    57         6,811         1,289             $1,776         $4.74                   14%
1998-1999    60         4,429         1,320             —              —                       —

— Information not available.

* Includes figures only for the voucher component of the program, not the tutoring component.
** As of June for each school year.

Source: Ohio Department of Education. 



On June 24, 1997, Professor Paul 
Peterson of Harvard issued a press release that 
some observers interpreted to mean that his 
research team was conducting the official 
evaluation of the Cleveland program. In fact, 
his study was privately funded, not commis- 
sioned by the Ohio Department of Education. 

Three months later, in September, 
Peterson and co-authors Jay Greene and 
William Howell (PGH) released a report that 
analyzed test scores from two private schools, 
Hope Central Academy and Hope Ohio City 
Academy. The achievement results were 
expressed as percentile-rank changes on fall 
(1996)-to-spring (1997) testing. PGH report 
overall K-3 percentile-rank changes of +5.6 
(reading), -4.5 (language), +11.6 (math total), 
and +12.8 (math concepts). Most schools, 
however, gain every spring and fall back the 
next autumn. Indeed, as PGH report in a 
subsequent paper, by fall 1997 no significant 
gains for Hope students were observed in 
math concepts and no gains were observed in 
language. (Significant gains were still observed 
in total math and reading scores.) 15 



More important, for changes in test scores to 
be meaningful, a carefully chosen comparison 
group must also be tested. The September 
1997 PGH analysis had no such comparison 
group. Instead, it made a comparison to low- 
income Milwaukee voucher applicants whose 
results were not from the same test used by the 
Hope schools. The September 1997 PGH 
evaluation is so flawed that it contributes little 
if anything to an understanding of how 
voucher programs might affect student 
achievement. 

Official Evaluation Results for Cleveland 
Scholarship and Tutoring Program 

The legislatively mandated indepen- 
dent evaluation of the Cleveland Scholarship 
and Tutoring Program is being conducted by 
an Indiana University research team headed by 
Professor Kim Metcalf. This team published 
reports on the program’s first year (1996-97) 
in March 1998 and second year (1997-98) in 
November 1998. 16 
To evaluate the Cleveland voucher 
program, Metcalf’s team compared the test 
scores of third-grade voucher recipients with 
those of Cleveland Public School students, 
controlling for prior test scores and family 
characteristics. In 1996-97, the Metcalf 
evaluation examined third grade performance 
because that was the lowest grade for which 
usable test data (from second grade) existed to 
measure student ability prior to the voucher 
experiment. 

The first-year official evaluation report 
found that, after controlling for background 
characteristics, third-graders participating in 
the voucher program in 1996-97 did not 
achieve at a higher level (on reading, lan- 
guage, mathematics, science, and social 
studies tests) than students who remained in 
the Cleveland Public Schools. The second- 
year report (1997-98) found that fourth-grade 
students in the voucher program achieved 
significantly better than their public school 
counterparts in science and language. When 
classroom variables (e.g., class size, teacher 
experience, and teacher level of education) are 
accounted for, the voucher students achieved 
significantly higher scores only in language. 

The Peterson team criticized the 
Metcalf team’s first-year report for several 
reasons. 17 PGH argued against the use of 
second grade test data as a control for student 
performance prior to the voucher program on 
the grounds that these test results “lack plausi- 
bility.” PGH deemed these test scores implau- 
sible because the scores showed low-income, 
largely single-parent families performing close 
to the national average in the second grade and 
then scoring at substantially lower levels the 
next year. PGH also maintained that the 



second-grade test scores have implausibly 
weak correlations with family background 
characteristics. Leaving out the second-grade 
test scores, however, means that any compari- 
son of voucher student achievement with that 
of public school students takes no account of 
differences in student performance prior to the 
program. Moreover, if the second-grade test 
scores were uniformly inflated for both 
voucher students and those who remained in 
the Cleveland Public Schools (e.g., because 
second-grade public schools “teach to the 
test”), they would still be a good control 
measure. 

PGH also maintained that the Metcalf 
evaluation team should have included student 
scores from the Hope schools, since 25 per- 
cent of voucher students went to these newly 
created schools. Metcalf’s team had excluded 
the Hope schools because their students took a 
different test than the public school students 
and students at other voucher schools. An 
additional problem with including Hope 
students is that approximately 58 of the 155 
Hope students tested in the spring of 1996 
appear not to have been tested in the fall of 
1997, an unusually high attrition rate. Without 
information on the characteristics of these 
students it cannot be known what impact their 
absence may have had on the results reported. 
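The attrition figure cited above can be expressed as a rate with a short calculation. The sketch below is illustrative only: the function name `attrition_rate` is an assumption introduced here, and the two counts (155 students tested, approximately 58 not retested) are taken directly from the text.

```python
def attrition_rate(tested_at_start: int, missing_at_followup: int) -> float:
    """Return the share of initially tested students with no follow-up score."""
    if tested_at_start <= 0:
        raise ValueError("tested_at_start must be positive")
    if not 0 <= missing_at_followup <= tested_at_start:
        raise ValueError("missing_at_followup must lie between 0 and tested_at_start")
    return missing_at_followup / tested_at_start


# Counts reported in the text for the Hope schools (58 is approximate).
hope_rate = attrition_rate(tested_at_start=155, missing_at_followup=58)
print(f"Hope schools test attrition: {hope_rate:.1%}")
```

Because the count of 58 is itself approximate, the resulting rate (a little over one-third) should be read as an approximation as well.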

When PGH reanalyzed the official data 
excluding the second-grade test scores and 
including the Hope students with converted 
scores, they found that voucher students 
scored significantly higher in language and 
science, but not significantly higher in math, 
reading, or social studies. When the second- 
grade test scores were included, the Peterson 
team found results consistent with those of the 
official evaluation team: voucher students did 
not score significantly higher than their public 
school counterparts at conventional levels of 
statistical significance. Using a less stringent 
significance threshold than is conventional 
(the .10 level, which accepts a 10 percent chance of 
declaring a difference when none exists), PGH 
found that voucher students did better in 
language and science, but not in reading, 
math, and social studies. 

The second-year (1997-98) Metcalf 
team evaluation also found that not all schools 
participating in the voucher program had 
similar achievement results. Students attend- 
ing established private schools were respon- 
sible for the voucher student achievement 
advantage in science and language. Students 
in the newly established private Hope schools 
scored significantly lower than their public 
school counterparts in all tested areas. 

The finding that student performance 
in the new voucher schools is significantly 
worse than student performance in public 
schools raises serious questions about the 
viability of voucher programs as a large-scale 
education reform. Existing private schools 
may produce benefits for low-income students 
by placing them with a majority of students 
from more privileged or more academically 
oriented backgrounds. The adoption of large- 
scale voucher programs may, however, alter 
the social context that produces whatever 
achievement benefit there may be for low- 
income minority students in attending private 
schools. 18 



Private Voucher Programs 

Voucher programs supported by private 
sources provide another potential source of 
information on the educational consequences 
of vouchers. In 1998-99 there were 41 pri- 
vately funded voucher programs in the United 
States, according to Troy Williamson of the 
CEO America Foundation (interview, March 
29, 1999). There have been few systematic 
efforts to study the impact these programs are 
having on student achievement. This section 
describes those programs for which achieve- 
ment data exist or for which an evaluation 
plan that will provide achievement informa- 
tion has been adopted. 

Milwaukee: Partners Advancing 
Values In Education (PAVE) 

Perhaps the country’s largest private 
program operates in Milwaukee. Partners 
Advancing Values in Education (PAVE) - 
formerly the Milwaukee Archdiocesan Educa- 
tion Foundation - was founded in 1992. PAVE 
provides low-income families with scholar- 
ships worth half of the tuition charged by a 
private religious or non-sectarian school up to 
a maximum of $1,000 for elementary and 
middle school students and $1,500 for high 
school students. PAVE’s major donors include 
the Lynde and Harry Bradley Foundation, 
TREK Corporation, CEO America, Johnson 
Controls, Northwestern Mutual Life Insurance 
Co., Siebert Lutheran Foundation, and Wis- 
consin Electric Power. 




Of the five evaluations of the PAVE 
program, only the 1994 report made a serious 
effort to determine the program’s effect on 
student achievement. 19 The 1994 evaluation 
suggested that students who attended private 
schools for their entire school career achieved 
at higher levels than students who transferred 
from a public school into a private school 
participating in the PAVE program. Further, 
the evaluation suggested that the longer 
transfer students stayed in participating private 
schools the greater their achievement. 

Unfortunately, since the data gathered 
depended entirely on the voluntary coopera- 
tion of parents, the findings are suspect and no 
conclusion can be drawn from the evaluation’s 
results. 

Indianapolis: The Educational Choice 
Charitable Trust 

The Educational Choice Charitable 
Trust was established in 1991 with a $1.2 
million grant from J. Patrick Rooney, Chair- 
man and CEO of Golden Rule Insurance 
Company. The Trust provides educational 
vouchers worth half the cost of private school 
tuition up to a maximum of $800. Families 
with children who qualify for the free or 
reduced-price lunch program and live in the 
Indianapolis school district are eligible. Half 
the money in the program was reserved for 
families whose children were in private 
schools prior to the creation of the program. 

In March 1996 the Hudson Institute 
issued a report by David Weinschrott and 
Sally Kilgore assessing the impact of the 
program. 20 Public school students, but not 
voucher students, showed a drop-off in read- 



ing, language, and math scores in sixth and 
eighth grade. 

Weinschrott and Kilgore described 
their evaluation framework as “informal.” It 
was based on a small number of voucher 
students enrolled in a handful of voucher 
schools. The analysis did not control for 
differences in student characteristics, test 
scores prior to the voucher program, or other 
potentially significant variables that may have 
influenced the findings. 

The New York School Choice Program 

The New York City School Choice 
Scholarships Foundation (SCSF) was estab- 
lished in 1997 with $5 million of its $7 mil- 
lion commitment coming from New York 
businesspeople. SCSF offers tuition vouchers 
worth up to $1,400 to students whose family 
income makes them eligible for the free 
school lunch program. Eighty-five percent of 
the scholarships are reserved for public school 
students whose test scores are below the 
citywide median. In its first year (1997-98), 
the program offered scholarships for up to 
1,300 students and actually placed about 1,200 
students in private schools. In 1998-99, an 
additional 1,000 students participated in the 
program. SCSF has made four-year commit- 
ments to the current participants and will add 
more students as funding permits. 

Of parents expressing interest in the 
program, a randomly selected group were 
interviewed to determine their eligibility, 
while their children (except for kindergartners) 
were administered the Iowa Test of Basic 
Skills in reading and math. A lottery deter- 
mined which eligible students would be 
offered vouchers. 
In the spring of 1997, Mathematica 
Policy Research and Paul Peterson of the 
Harvard Program on Education Policy and 
Governance began a three-year evaluation of 
the performance of students entering the New 
York SCSF Program in 1997-98. 21 The evalu- 
ation examines two issues. (1) It compares the 
achievement of about 750 students who used 
vouchers with that of 960 students whose 
families sought but did not receive a scholar- 
ship. (Ten percent of the non-voucher stu- 
dents ultimately attended private school 
anyway.) (2) The evaluation also compares 
the achievement of 1,000 students offered a 
voucher, including some students who did not 
use one, with that of the same control group of 
960 students. 

A limitation of the first comparison is 
that although a random group of students 
received scholarship offers, a non-random 
group appears to have accepted offers. 
According to Peterson, Myers, and Howell 
(PMH), families that used scholarships had 
higher incomes and more education than 
families that did not use scholarships. 22 PMH 
used standard statistical procedures to control 
for differences between voucher users and 
students not offered a scholarship. However, 
they did not provide enough information about 
these procedures to permit a complete evalua- 
tion of them. 

The second comparison gets around 
the non-random nature of the group that 
actually used scholarships by taking advantage 
of the “natural experiment” resulting from the 
use of a random lottery to select those offered 
vouchers. As a result of this lottery, the 
background characteristics of those offered 
scholarships and of those not offered scholar- 
ships may be assumed to be, on average, the 



same. Any differences between the two 
groups can be attributed to the “offer” of a 
scholarship. This comparison, however, is 
somewhat difficult to interpret. Why would 
the offer of a scholarship be expected to make 
a difference to the performance of students 
who do not actually accept the scholarship? 
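The second comparison amounts to a difference in average scores between everyone offered a voucher (users and non-users alike) and the lottery losers. A minimal sketch follows; the function name `offer_effect` and all of the scores are hypothetical and are not data from the PMH evaluation.

```python
from statistics import mean


def offer_effect(offered_scores, control_scores):
    """Mean score of all students offered a voucher minus mean score of lottery losers."""
    return mean(offered_scores) - mean(control_scores)


# Hypothetical test scores, for illustration only.
offered = [34, 41, 29, 38, 45, 31]  # includes students who declined the offer
control = [30, 39, 27, 36, 40, 32]  # applied but were not offered a voucher

effect = offer_effect(offered, control)
print(f"Estimated effect of being offered a voucher: {effect:+.2f} points")
```

Because the offered group includes students who never used a voucher, any difference computed this way averages over users and non-users together, which is part of why the comparison is hard to interpret.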

In November 1998, PMH released 
first-year evaluation results. They found that 
being offered a voucher raised performance 
significantly in math in second, fourth, and fifth 
grades, and in reading in fifth grade. In third 
grade, being offered a voucher was negatively 
correlated with math and reading achievement 
but not significantly so. The effect on 
achievement of actually receiving a voucher 
was statistically significant in math in second, 
fourth, and fifth grade, and in reading in fifth 
grade. In third grade, receiving a voucher was 
negatively correlated with math and reading 
achievement but not significantly so. 

PMH increased the number of so- 
called significant results by using a statistical 
method that requires assuming vouchers can 
increase but not decrease student achieve- 
ment. 23 The conflicting results reported in the 
literature on vouchers and public versus 
private schools make this assumption ques- 
tionable. Without this assumption, only the 
results for fourth-grade math, fifth-grade 
reading, and combined fourth- and fifth-grade 
math are significant. In addition, the differ- 
ences between the results across grade levels 
are hard to interpret. This suggests that the 
results should be treated with caution until 
more data are available. 
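The statistical point at issue is the difference between a one-tailed test (which assumes the effect can only be positive) and a two-tailed test (which allows for an effect in either direction). The sketch below uses a normal approximation and a hypothetical test statistic; it is not a reconstruction of the PMH analysis, and the names `normal_cdf` and `p_values` are introduced here for illustration.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_values(z: float):
    """Return (one-tailed, two-tailed) p-values for a test statistic z.

    The one-tailed value assumes the only alternative considered is a
    positive effect; the two-tailed value allows either direction.
    """
    one_tailed = 1.0 - normal_cdf(z)
    two_tailed = 2.0 * (1.0 - normal_cdf(abs(z)))
    return one_tailed, two_tailed


z = 1.80  # hypothetical test statistic, not taken from the evaluation
one_tailed, two_tailed = p_values(z)
print(f"one-tailed p = {one_tailed:.3f}, two-tailed p = {two_tailed:.3f}")
```

With this hypothetical statistic, the one-tailed p-value (about 0.036) falls below the conventional .05 threshold while the two-tailed p-value (about 0.072) does not, which illustrates how the choice of test can change the number of results labeled significant.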

Since the PMH evaluation of the New 
York SCSF program constructs comparison 
groups, it is more informative than the PGH 
analysis of the two Hope Schools in Cleveland. 
However, as PMH acknowledge, their 
SCSF evaluation involved a small number of 
students, and a much larger program could 
have quite different 
outcomes. A number of characteristics of the 
schools attended by voucher students in the 
New York experiment might not exist in a 
large-scale experiment. For example, com- 
pared to the schools attended by the control 
group, the voucher schools had small classes 
and were somewhat more racially integrated. 
Parents perceived that voucher schools had 
fewer problems with safety, fighting, cheating, 
missing classes, being late for school, and 
destroying property. 

The frailty of positive findings from 
participation in voucher programs is suggested 
by the ad hoc and inconsistent ways that 
Peterson and co-authors explained findings 
from New York and from Milwaukee. In their 
analysis of the Milwaukee Parental Choice 
Program, Greene, Peterson, and Du found 
significant achievement effects only for 
students who had been in the program for 
three or four years. 24 They hypothesized that 
participation in a voucher program has a 
cumulative effect, with positive results only 
appearing in the third and fourth years, after 
students have been socialized in their new 
setting. In discussing the New York program, 
Peterson, Myers, and Howell hypothesized 
that they found significant results only for 
fourth- and fifth-grade students because 
vouchers are a more potent intervention for 
older students. They added that smaller 
classes may be more potent for younger 
students — an explanation at odds with the fact 
that students at voucher schools in the New 
York program attended smaller classes than 
students in the control group. 



In discussing their first-year New York 
results, PMH argued that the magnitudes of 
the positive achievement effects observed “do 
not differ materially from those observed in” 
the Tennessee class-size reduction program. 25 
This comparison is problematic because of the 
instability of most of the SCSF findings 
compared with the Tennessee results. Charles 
Achilles, one of the Tennessee experiment's 
principal investigators, pointed out that since 
the students in the SCSF evaluation are about 
95 percent minority, it might be more appro- 
priate to compare SCSF effect sizes with the 
effect sizes observed for Tennessee minority 
students. 26 When this comparison is made, the 
Tennessee effect sizes (between .30 and .40) 
are much larger and much more stable than the 
effect sizes reported by PMH (-.09 to .27). 

The Washington (D.C.) Scholarship Fund 

The Washington Scholarship Fund 
(WSF) was established in 1993 to provide 
vouchers to low-income students. Its funding 
comes from a variety of individuals including 
John Walton and Ted Forstmann and founda- 
tions such as the Lynde and Harry Bradley 
Foundation. In the fall of 1997, 460 WSF 
participants were attending 72 private schools. 
Beginning with the 1998-99 school year, the 
program planned to offer vouchers worth up to 
$2,200 to more than 1,000 students in grades 
K-8. No family with an income higher than 
2.5 times the poverty level may participate. 
Families with incomes that fall below the 
poverty line are eligible for vouchers worth up 
to 60 percent of the cost of private school 
tuition. 
Parents Advancing Choice in Education 
(Dayton, Ohio) 

For the 1998-99 school year, the 
Parents Advancing Choice in Education 
(PACE) program in Dayton, Ohio, offered 
vouchers to 530 students previously enrolled 
in public schools and 250 students previously 
enrolled in private schools. The program 
pays up to 60 percent of the tuition at one of 
20 private schools participating in the pro- 
gram, up to a maximum of $1,200. The pro- 
gram is funded by the Thomas B. Fordham 
Foundation and a consortium of Dayton 
community leaders. 

The WSF and PACE programs are 
being evaluated by the Harvard Program on 
Education Policy and Governance, the North- 
ern Illinois University Social Science Re- 
search Unit, and (for the PACE program only) 
the University of Dayton. 27 In each program, a 
randomized design similar to that used to 
evaluate the New York School Choice Schol- 
arship program is being implemented. At this 
point, no achievement data are available for 
either program. 

San Antonio Private Voucher Programs 

San Antonio has two private voucher 
programs, both of which are funded by the 
CEO America Foundation. The first began in 
1992 and offers a voucher worth up to half the 
cost of tuition (to a maximum of $800) to any 
K-8 student eligible for free or reduced-price 
lunches who resides in Bexar County, Texas. 
Students may attend public or private schools. 
Godwin, Kemerer, and Martinez compared the 
effects of public school choice and private 
voucher programs in San Antonio. 28 The small 
number of students (85) for whom baseline 



(1991-92) and final-year (1995-96) test score 
data were available and the limited nature of 
the results make their achievement findings of 
little value. 

In April 1998 the CEO America 
Foundation and James Leininger committed 
$50 million over a period of 10 years to 
launch the Horizon Program. It is the first 
private voucher program in the country to 
offer a voucher to every low-income student 
within a single school district (the Edgewood 
Independent School District in San Antonio, 
Texas). Any K-12 student who is eligible for 
a free or reduced-price lunch and who resides 
in the district may participate. Vouchers may 
pay up to 100 percent of a participating 
school’s tuition, to a maximum of $3,600 
(grades K-8) for schools in the district and a 
maximum of $2,000 (grades K-8) for schools 
outside the district. For grades 9-12 the 
program pays up to $4,000 for schools in the 
district and up to $3,500 for schools outside 
the district. 29 

The evaluation of the Horizon Program 
is to be conducted by David Myers 
(Mathematica Policy Research), Paul Peterson 
(Harvard University), Jay Greene (University 
of Texas), and Rodolfo de la Garza (Thomas 
Rivera Policy Institute). Beginning with the 
1998-99 school year, the evaluation will 
compare the Edgewood School District to 
three similar school districts on a number of 
dimensions including student achievement. 

The first evaluation is due to be issued in 
1999. When this report went to press, no 
detailed information on the evaluation design 
was available. 
Vouchers and Educational Equity 

The gap in funding between affluent 
and low-income districts in Pennsylvania 
already exceeds that in most other states. As 
of 1991-92, the last year for which comparable 
data have been collected for all 50 states, 
Pennsylvania had the 11th-largest gap in state 
and local funding per pupil between high- 
income and poor districts. 30 A major concern 
with vouchers is that they could further in- 
crease funding inequities and the stratification 
of students by income, race, and social back- 
ground. 

Vouchers could increase inequity by 
diverting money from students currently 
served by the public schools to students who 
already go to private schools. For example, 
rather than providing Milwaukee Public 
School children with choice, the expansion of 
the Milwaukee voucher and charter school 
programs appears to be diverting money from 
children in the public schools and subsidizing 
families who were already sending their 
children to private schools. According to 
Henry Levin (Stanford University), the 5,902 
students enrolled in either charter or voucher 
schools cost the Milwaukee Public Schools 
$29,214,900 in revenue in 1997-98. Of the 
5,902 voucher and charter school students, 
only 1,379 had attended the Milwaukee Public 
Schools the previous year. 31 
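The proportions implied by Levin's Milwaukee figures can be worked out directly. The three input numbers below come from the text; the variable names are assumptions made for this sketch. Note that the source says only that the remaining students had not attended the Milwaukee Public Schools the previous year, not where they were enrolled.

```python
# Figures reported in the text for 1997-98 (attributed to Henry Levin).
total_students = 5_902      # students in charter or voucher schools
revenue_lost = 29_214_900   # dollars of revenue lost by the Milwaukee Public Schools
from_mps = 1_379            # students who attended MPS the previous year

share_from_mps = from_mps / total_students
not_from_mps = total_students - from_mps
revenue_per_student = revenue_lost / total_students

print(f"Share who attended MPS the previous year: {share_from_mps:.1%}")
print(f"Students who did not attend MPS the previous year: {not_from_mps:,}")
print(f"Revenue lost per charter or voucher student: ${revenue_per_student:,.0f}")
```

On these figures, fewer than one in four of the charter and voucher students had been enrolled in the Milwaukee Public Schools the year before.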

Levin estimated that a national 
voucher program that included all current 
private school students and that offered the 
full range of services provided by public 
schools would cost $33 billion annually. The 
costs of accommodating additional students in 
private schools, record-keeping and monitor- 
ing, and providing transportation would add 



another $40 billion, bringing the total to $73 
billion, about 25 percent of the current cost of 
public education nationally. 
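The arithmetic behind Levin's national estimate can be checked in a few lines. The dollar figures and the 25 percent share come from the text; the implied total for public education is derived here from those figures and should be read as a rough back-calculation, since the text gives the share only as "about 25 percent."

```python
# Levin's estimates as reported in the text, in billions of dollars per year.
base_voucher_cost = 33   # vouchers for all current private school students
additional_costs = 40    # added capacity, record-keeping, monitoring, transportation
share_of_public_spending = 0.25  # "about 25 percent", as stated in the text

total_voucher_cost = base_voucher_cost + additional_costs
implied_public_spending = total_voucher_cost / share_of_public_spending

print(f"Total estimated voucher cost: ${total_voucher_cost} billion")
print(f"Implied national public education cost: about ${implied_public_spending:.0f} billion")
```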

New evidence from Arizona corrobo- 
rates the fear that a large-scale school choice 
program may increase stratification in the 
schools based on income, race, and ethnicity. 
Casey D. Cobb and Gene V. Glass found that 
Arizona charter schools are increasing racial 
segregation in public education. Minority 
students are disproportionately enrolled in 
charter schools with non-college-preparatory 
curricula. 32 Large-scale voucher programs 
would share many of the characteristics of 
Arizona’s largely unregulated charter school 
program and may, therefore, similarly reduce 
educational equity. 

There is evidence that all school 
choice programs, public school choice as well 
as voucher and charter school programs, 
increase student stratification by income and 
other family background characteristics but do 
not necessarily produce academic gains. 

Godwin, Kemerer, and Martinez, in 
their analysis of the characteristics of families 
that chose to participate in either public or 
private school choice programs in San 
Antonio, found significant differences be- 
tween choosing and non-choosing families. 
Choosing families had more education, higher 
incomes, higher employment levels, and fewer 
children, and were less likely to be on welfare, 
less likely to be African-American, and more 
likely to be two-parent families. Choosing 
families also had higher educational expecta- 
tions and were more active in their children’s 
education. In addition, their children had 
higher standardized test scores. 33 
A 1992 Carnegie Foundation report 
evaluated choice programs around the country 
and reached the following conclusions. (1) To 
the extent that choice programs benefit chil- 
dren at all, they benefit the children of better 
educated parents. (2) Choice programs require 
additional money to operate. (3) Choice 
programs have the potential to widen the gap 
between rich and poor school districts. 
(4) School choice does not necessarily improve 
student achievement. 34 Bruce Fuller, in 
a 1995 review, drew conclusions similar to 
those of the Carnegie report. 35 

In a review of the research on school 
choice in three countries (the U.S., Great 
Britain, and New Zealand), Geoff Whitty 
found little evidence to support the contention 
that the creation of educational “markets” 
increases student achievement. He did, how- 
ever, find that educational “markets” make 
existing inequalities in the provision of education
worse. 36 Martin Carnoy drew a similar
conclusion based on an analysis of the effects 
of school privatization in Chile and other 
countries. 37 



The political figure most closely 
identified with the contemporary voucher 
movement, Wisconsin state legislator Polly 
Williams, now expresses concerns about the 
political pressure to create voucher programs 
that would increase educational inequity. She 
told the Boston Globe in October 1998: 

I knew from the beginning that white 
Republicans and rich, right-wing 
foundations that praised me and used 
me to validate their agenda would do it 
only so long as it suited their needs. . . . 
This is why most black groups like the 
NAACP are against vouchers because 
without the income cap, choice just 
becomes a free-market program that 
keeps richer families happy and Catho- 
lic and Lutheran schools solvent with 
state money without any commitment 
to improve public schools. . . . Too 
many people in the voucher crowd 
exploit low-income black children, 
saying we are creating vouchers for 
them when what they really have in 
mind is bringing in a Trojan horse. . . . 
I’ve never seen a situation where low- 
income people, when they have to 
compete in education with people with 
far more resources, come out equal. 38 

SMALLER CLASSES: A RESEARCH UPDATE



The Recent History of Class-Size Research 

The current interest in class-size 
research can be traced to an influential and 
controversial 1978 meta-analysis of class size
studies from more than a dozen countries by 
Professor Gene Glass of Arizona State Univer- 
sity and Mary Lee Smith. 39 Glass and Smith 
concluded that small classes produce higher 
levels of student achievement than large 
classes. For example, they found that being 
taught in a one-on-one tutorial as opposed to a 
40-student class improved student perfor- 
mance by 30 percentile ranks. Glass and Smith 
argued that to be most effective classes should 
have about 15 students.

Robinson and Wittebols criticized the 
Glass and Smith study for drawing conclu- 
sions from too few studies and relying too 
heavily on research on individual tutoring. 40 
Professor Robert Slavin of Johns Hopkins 
University also considered Glass and Smith’s 
analysis flawed because it did not adequately 
take into account qualitative distinctions 
between studies. 41 In Slavin’s view, except for 
studies of class sizes of one, Glass and 
Smith’s evidence that class-size reductions 
raised achievement was weak. 

The Tennessee Student-Teacher 
Achievement Ratio (STAR) Study 

Against this backdrop of controversy 
over the relationship between class size and 
student achievement, Tennessee launched the 
STAR program in the mid-1980s. Key Ten- 
nessee legislators knew of an Indiana class- 
size program and a class-size study conducted 
in Nashville. 42 They were particularly influenced
by Glass and Smith’s meta-analysis,
which suggested reducing class size to about
15. Mindful of the cost of reducing class size,
the legislature wanted to study the impact of 
reducing class size in the early grades before 
adopting a class-size reduction policy. 

In 1985, the Tennessee legislature 
passed, and Governor Lamar Alexander 
signed into law, funding for a statewide class- 
size experiment. The STAR study followed a 
group of students from kindergarten through 
third grade. Since Tennessee did not require 
kindergarten, many STAR students entered the 
study as first-graders. The STAR study began 
in the fall of 1985 in 79 schools within 42 
school districts throughout the state. 

Researchers classified schools as:
(1) inner-city (metropolitan-area schools in
which more than half the students received
free or reduced-price lunches), (2) urban,
(3) suburban, and (4) rural.

Within each participating school, the 
state Department of Education randomly 
assigned teachers and students to one of three 
types of classes: small (S) classes (typically 
13-17 students), regular (R) classes (typically 
22-25 students), and regular classes with a 
full-time instructional aide (RA) (typically 22- 
25 students). 

To ensure that curriculum differences, 
leadership style, school climate, and other 
school-specific factors did not influence the 
results, all schools participating in the project 
had to be large enough to have all three types 
of classes at all four grade levels. The STAR 
project also required that there be no changes 
in participating schools other than the estab- 
lishment of the three types of classes. 

STAR is one of the few truly scientific
experiments ever conducted in education. It is
also a large study that involved about 6,500
students each year. In all, 11,600 different
students participated in Project STAR, of
whom 1,842 remained in the same type of
class for all four years and 2,571 remained in 
the same type of class for grades 1-3. 

Students in STAR were tested in 
reading and math on the Stanford Achieve- 
ment Test and the Tennessee Basic Skills First 
test. STAR researchers compared improve- 
ments in achievement each year by each class 
type. They also compared the performance of 
students in small classes for three consecutive 
years with the performance of students in each 
type of regular class for three consecutive 
years. 

STAR researchers found that students 
in small classes outperformed students in both 
R and RA classes across the board in all 
geographical areas and at all grade levels. 
Regular classrooms with a teacher’s aide 
showed a slight but not statistically significant 
achievement advantage over regular class- 
rooms in first grade. Jayne Boyd-Zaharias and 
Helen Pate-Bain reported in 1998 that their 
analysis of STAR data found no achievement 
advantage for classes of 25 students with a 
full-time teacher’s aide compared to classes of 
25 without an aide. This was true in grades K- 
3 and in a follow-up study of students in 
grades 4-8. 43 

Averaged over four years, students in 
small classes had an advantage of a bit more 
than eight percentile ranks over students in 
regular classes in reading and a bit less than 
eight percentile ranks in math. The effect size
in reading averaged over four years was about
0.26. In math it was 0.23. 44

In a May 1997 reexamination of the
STAR data, economist Alan Krueger of
Princeton University confirmed the original
findings of the STAR investigators. 45 Krueger
controlled for other measured factors that 
might influence performance, including 
student characteristics (race, gender, eligibility 
for free lunch, whether the student was new to 
the school, etc.) and teacher characteristics 
(race, gender, experience, and educational 
qualifications). Because students and teachers 
were initially placed at random into the three 
types of classes, these characteristics would 
not be expected to influence the impact of 
class size on performance. As anticipated, 
Krueger found that controlling for these 
variables had very little effect. He still found
overall effect sizes that ranged from 0.19 to
0.28 in each of the four years, similar to the
range reported in the original STAR analysis. 46

The original STAR results may be 
understated because some classes labeled as 
small were actually larger than some labeled 
as large. (Since the number of students in a 
grade does not fall into multiples of 13-17 and 
22-25, it is unavoidable that small and regular 
classes be distributed around these targets.) A 
research team headed by Professor Barbara 
Nye and B. DeWayne Fulton (Tennessee State
University) re-estimated the performance 
difference of small classes and regular classes 
after removing all small classes that did not 
have 12-14 students and all regular classes 
that did not have at least 23 students from the 
sample. They reported effect-size advantages 
for small classes that average 0.56 for reading 
and 0.47 for math. 47 







The STAR study also found that small 
classes especially raised achievement in inner- 
city Tennessee classrooms with large concen- 
trations of minority students. 48 Jeremy Finn 
and Charles Achilles concluded in a recent 
review paper that “in most comparisons, the 
benefit to minority students is about two to 
three times as large as that for whites.” 49
Krueger also found that lower achieving, 
minority, and poor students benefit the most 
from attending smaller classes. 50 

Charles Achilles, Jeremy Finn, and 
Helen Bain reported that when both white and 
non-white Tennessee students began kinder- 
garten in small classes, 87 percent of white 
and 86 percent of non-white first graders 
passed the Basic Skills First reading test. For 
students who began kindergarten in regular 
classes, the non-white first grade pass rate 
trailed the white pass rate by 12 percentage 
points. 51 

In a review of the research literature on 
the white-black test gap and on class size, 
including the Tennessee experience, Steven 
Bingham concluded that small class size in the 
early grades is an effective achievement gap- 
reduction strategy. He maintained that minor- 
ity children should be placed in small classes 
early (preferably in kindergarten) and remain 
in a small class for at least two years. 52 

The STAR study found that small 
classes increased promotion rates from each 
grade. Over the four years of the study, 80.2 
percent of students in small classes moved up 
to the next grade the following year, compared 
with 72.6 percent of students in regular 
classes. Raising promotion rates for each 
grade saves money by reducing the number of 
students taught twice at each grade level. 53 



In addition, when more students are 
held back, the R and RA classes at the next 
grade level end up with fewer low-scoring 
students. If students in R and RA classes had 
been promoted at the same rate as those in 
small classes, the relative test scores of R and 
RA classes might have been even lower. The 
higher retention-in-grade rates of R and RA 
classes may cause the estimate of the addi- 
tional benefit of several years in a small class 
to be understated. 

Finally, the Tennessee experiment 
provides some evidence that small classes 
mitigate the negative effect of large schools 
documented by William Fowler and Herbert 
Walberg (University of Illinois at Chicago). 54 
According to Achilles, students in regular 
classes achieved less well in large schools 
than small schools. Students in small classes 
did as well or nearly as well in large schools 
as in small schools. 55 

Because of the STAR study’s size and 
careful design, Harvard Professor Frederick 
Mosteller, in a report to the American Acad- 
emy of Arts and Sciences, characterized the 
study as “one of the great experiments in 
education in United States history.” 56 Never- 
theless, debate about the policy implications 
of the STAR results continues. 

Are the Benefits of Smaller Classes Cumulative
and Do the Benefits Last?

Recent debate about the Tennessee 
STAR experiment centers on two questions 
that turn out to be related. (1) Are the benefits
of smaller classes cumulative? (2) Do the 
benefits last? 







Initial research on the STAR experi- 
ment indicated that most of the gain appeared 
the first year children attended a smaller class, 
with the achievement gap between small and 
regular classes holding steady but not increas- 
ing in subsequent years. Based on this under- 
standing, Eric Hanushek argued in a February 
1998 paper that the Tennessee results support, 
at most, movement toward small kindergarten 
and first-grade classes. 57 

Contrary to Hanushek’s conclusion, an 
increasing body of research indicates that 
achievement benefits do increase with addi- 
tional years in small classes. Krueger found 
that while the achievement of students in 
small classes jumped by about four percentile 
ranks in the first year a student attended a 
small class, it improved by almost an addi- 
tional percentile rank for each additional year. 
The initial effect was highly significant and 
the incremental improvement in subsequent 
years was on the margin of statistical signifi- 
cance. 58 

Additional new research on STAR 
students relies on a data base constructed for 
the Lasting Benefits Study (LBS), an analysis 
of the achievement of small- and regular-class 
STAR students in higher grades. A STAR 
student is defined in the LBS as any student 
who spent at least third grade in a STAR 
classroom. 59 

Through eighth grade, the original 
LBS studies found that students in small 
classes during part or all of K-3 continued to 
outperform graduates of R and RA classes by 
statistically significant amounts. 60 The 
achievement advantage for minority students 
who participated in small classes remained
larger than that for white students. 61 Lasting
benefits from small K-3 classes were found in 
a wide spectrum of subjects, including read- 
ing, language, math, study skills, science, and 
social studies. 62 

The Lasting Benefits Study showed
eighth-grade effect sizes of 0.04 to 0.08, 63
seventh-grade effect sizes that ranged from
0.08 to 0.16, 64 sixth-grade effect sizes of 0.14
to 0.26, 65 fifth-grade results ranging from 0.17
to 0.34, 66 and fourth-grade effect sizes of 0.11
to 0.16. 67 STAR students from small classes
continued to outperform students in regular
classes, but the presence of a teacher’s aide
continued to have very little, if any, impact on
achievement.

The new research using the LBS data 
base separately examined children who at- 
tended small classes for one, two, three, or 
four years. Barbara Nye, Larry V. Hedges and 
Spyros Konstantopoulos (NHK) found that 
statistically significant benefits from small 
classes persisted to eighth grade only for 
students who spent at least two grades in a 
small class. On eighth-grade math, reading, 
and science tests, the effect size for students 
who attended small classes for four years was 
0.3 to 0.37, similar to the effect size for these 
students in grades four and six. By eighth 
grade, the achievement benefit of spending 
four years in small classes equaled four or 
more times that of spending one year in a 
small class, 80 percent more than that of 
spending two years in small classes, and 20 to 
30 percent more than that of spending three 
years in small classes. In other words, the 
incremental benefit of each additional year in 
small classes appears to be roughly the same. 68
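
The ratios reported by NHK can be turned into rough per-year increments. The sketch below is illustrative arithmetic only and is not part of the NHK analysis. It assumes the four-year benefit is normalized to 1.0, treats "four or more times" as exactly four times (so the one-year figure is an upper bound), and uses the midpoint of the 20-to-30-percent range for three years.

```python
# Illustrative arithmetic only: the four-year benefit is normalized to 1.0.
FOUR_YEAR = 1.0


def implied_benefit(ratio_to_four_year):
    """Benefit of fewer years, given how many times larger the four-year benefit is."""
    return FOUR_YEAR / ratio_to_four_year


one_year = implied_benefit(4.0)          # "four or more times" (upper bound)
two_years = implied_benefit(1.8)         # "80 percent more"
three_years_low = implied_benefit(1.3)   # "30 percent more"
three_years_high = implied_benefit(1.2)  # "20 percent more"

# Midpoint for three years, an assumption made to keep the sketch simple.
three_years = (three_years_low + three_years_high) / 2

cumulative = [one_year, two_years, three_years, FOUR_YEAR]
increments = [cumulative[0]] + [
    later - earlier for earlier, later in zip(cumulative, cumulative[1:])
]

for year, gain in enumerate(increments, start=1):
    print(f"year {year}: about {gain:.2f} of the four-year benefit")
```

Under these assumptions, each additional year contributes between roughly one-fifth and one-third of the four-year benefit, which is consistent with reading the increments as roughly equal.
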

David Grissmer of RAND pointed out
that NHK’s findings imply a bigger benefit 
from the third and fourth years of small 
classes than Krueger’s estimates (which are 
described in the third paragraph of this sec- 
tion). 69 Grissmer hypothesized that this may 
partly reflect differences between the NHK 
and Krueger samples. Since NHK used the 
Lasting Benefits Study (LBS) data base, all the 
students in their sample took the third-grade 
test. Many of the students in their sample with 
one and two years in small classes entered 
small classes in second and third grade. In the 
full STAR sample that Krueger used, a high
proportion of the students in small classes for 
one and two years entered and reported test 
scores in kindergarten and first grade. In 
different ways, then, both the Krueger and 
NHK results underscore the value of small 
classes in kindergarten and first grade. They 
also point to the benefits of additional years in 
small classes, with the NHK results raising 
questions about the durability of the “jump” in 
achievement after one year if it is not consoli- 
dated by additional years in small classes. 

Finn and Achilles added another 
dimension to the literature on the lasting 
benefits of small classes by converting the 
achievement difference between small and 
regular class students into “grade equiva- 
lents.” 70 The effect sizes normally used to 
measure the achievement benefit of small 
classes divide the average test score difference 
in a grade by the variability (or standard 
deviation) of that test score. Student perfor- 
mance, however, varies more in higher grades, 
increasing the denominator in effect-size 
measurements. 
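
The point about the denominator can be made concrete with a short sketch. The scores and standard deviations below are hypothetical and are not STAR data; the function simply applies the definition given above (difference in average scores divided by the standard deviation).

```python
def effect_size(mean_small, mean_regular, std_dev):
    """Standardized difference: gap in mean scores divided by the score's standard deviation."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_small - mean_regular) / std_dev


# Hypothetical scores, not STAR data: the same 8-point gap in two grades,
# with more spread in scores in the later grade.
early_grade = effect_size(mean_small=508.0, mean_regular=500.0, std_dev=30.0)
later_grade = effect_size(mean_small=708.0, mean_regular=700.0, std_dev=45.0)

print(round(early_grade, 2))
print(round(later_grade, 2))
```

With these made-up numbers, an identical 8-point gap produces an effect size of about 0.27 in the earlier grade and about 0.18 in the later grade, solely because scores are more spread out in the later grade.
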



Grade equivalents (GEs) offer another 
way of looking at the impact of smaller 
classes. The grade equivalent of a test score is 
the grade level for which that score was the 
median score. 71 For example, if the median
test score of students with four months of
fourth grade was 100, the grade equivalent of
a score of 100 would be fourth grade plus four
months. Using grade equivalents, a difference
in average test scores between small and 
regular classes can be converted into a differ- 
ence measured in grade-equivalent months of 
schooling. Table 3 converts the benefits from 
small K-3 classes into GE months of school- 
ing. Table 4 does the same for the benefits of 
attending four years in small classes. 
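
A minimal sketch of the conversion follows, under stated assumptions. The norming table is hypothetical (it is not taken from any actual test), grade levels are encoded with a 10-month school year, and scores between norming points are interpolated linearly. None of these choices comes from Finn and Achilles.

```python
from bisect import bisect_right

# Hypothetical norming table, not actual test norms:
# (months of schooling, median score of students at that point).
# Grade 4 plus 4 months is encoded as 4 * 10 + 4 = 44; the 10-month
# school year is an assumption of this sketch.
NORMS = [(30, 80.0), (34, 88.0), (40, 95.0), (44, 100.0), (50, 108.0)]


def grade_equivalent_months(score):
    """Interpolate linearly between norming points to find the GE of a score."""
    months = [m for m, _ in NORMS]
    medians = [s for _, s in NORMS]
    if score <= medians[0]:
        return float(months[0])
    if score >= medians[-1]:
        return float(months[-1])
    i = bisect_right(medians, score)
    lo_m, hi_m = months[i - 1], months[i]
    lo_s, hi_s = medians[i - 1], medians[i]
    return lo_m + (hi_m - lo_m) * (score - lo_s) / (hi_s - lo_s)


def ge_gap_months(mean_small, mean_regular):
    """Difference between two average scores, expressed in GE months."""
    return grade_equivalent_months(mean_small) - grade_equivalent_months(mean_regular)


print(grade_equivalent_months(100.0))
print(ge_gap_months(100.0, 95.0))
```

In this made-up table, a score of 100 is the median for students four months into fourth grade, so an average score of 100 against an average of 95 corresponds to a gap of four GE months.
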

Based on their GE analysis, Finn and 
Achilles concluded that the achievement effect 
of being in a small class continues and generally
increases from grade to grade. 72

In September 1997, Health and Educa- 
tion Research Operative Services (HEROS), 
Inc., published a study of the extent to which 
10th-grade students who had been enrolled in
STAR small K-3 classes retained an achievement
advantage over students who had been in
regular classes and regular classes with a 
teacher’s aide. 73 The study analyzed the 
relative performance of these students on the 
Tennessee Competency Test. It found that the 
performance of students who had attended 
small classes was not significantly better than 
that of students who had been in regular 
classes. However, the researchers did find that 
significantly more of the former small-class 
students than regular-class students had passed 
the test by eighth grade. 

Table 3.

The Tennessee K-3 Small-Class Advantage Measured in
Grade Equivalent Months of Schooling

                    Kindergarten  Grade 1     Grade 2     Grade 3
Mathematics         1.6 months    2.8 months  3.3 months  2.8 months
Reading             0.5 months    1.2 months  3.9 months  4.6 months
Word Study Skills   0.5 months    0.8 months  4.7 months  5.7 months

Source: Jeremy D. Finn, Susan B. Gerber, Charles M. Achilles, and Jayne Boyd-Zaharias. “Short and Long-term Effects
of Small Classes,” paper prepared for conference on the Economics of School Reform, May 23-26, 1999, available from
finn@acsu.buffalo.edu.



Table 4.

The Achievement Benefits in Grades 4, 6 and 8 of Having Spent Four Years
in Small K-3 Classes, Measured in Grade Equivalent Months of Schooling

              Grade 4     Grade 6     Grade 8
Mathematics   5.9 months  8.4 months  1 year, 1 month
Reading       9.1 months  9.2 months  1 year, 2 months
Science       7.6 months  6.7 months  1 year, 1 month

Source: same as Table 2.



STAR students who graduated on 
schedule would have completed high school in 
spring 1998. In April 1999, Alan Krueger and 
Diane Whitmore reported preliminary results 
of an analysis of the rate at which a sample of 
9,397 STAR study participants took college- 
entrance exams (the ACT and SAT tests) as 
seniors. 74 Overall, 43.7 percent of students 
assigned to a small class in their first Project 
STAR year took the ACT or SAT exam, 
compared to 40 percent of students in regular 
classes and 39.9 percent of students in regular 
classes with an aide. These differences be- 
tween S-class students and R- and RA-class 
students were statistically significant at the 
0.05 level. 



Attending small classes raised the 
proportion of black students who took a 
college entrance exam by substantially more:
40.2 percent of black students in small classes
took either the ACT or SAT, compared to 31.7
percent of black students in regular classes. Attending
a small class reduced the black-white gap
in college-entrance test-taking by 54 percent. 

Students initially assigned to a class 
with 21-25 students were more likely to take 
the ACT or SAT exam than students who were 
assigned to classes with 26-30 students. They 
were less likely to take one of the exams than 
students initially assigned to classes with 16-20
students.

Even though significantly higher
proportions of small-class students took the 
college-entrance exams, their average scores 
were virtually the same as those of students in 
regular-size classes. The same held true for 
the subgroups examined. 

Preliminary findings from another 
ongoing study, based on the high school 
experiences of more than 3,000 former STAR 
participants, showed that 72 percent of small- 
class participants graduated from high school 
on schedule compared to 66 percent of 
regular-class students and 65 percent of 
students from regular classes with a teacher’s 
aide. While 23 percent of regular-class stu- 
dents and 26 percent of regular-class-with- 
aide students dropped out, only 19 percent of
small-class students dropped out. 75 

Future research on the STAR students 
by HEROS, Inc., will focus on experience in 
higher education and on social outcomes such 
as juvenile detention, adult imprisonment, 
welfare, and employment experience. 



Project Challenge 

Beginning in 1989, Tennessee followed
up its STAR experiment by establishing
Project Challenge, which provided the money 
necessary to reduce K-3 class size in 16 of the 
state’s poorest school districts. These districts 
typically placed low on achievement rankings 
of Tennessee’s 138 school districts. After the 
implementation of Project Challenge, student 
achievement in math and reading improved 
both in comparison to the performance of 
previous students in these districts and in 
relation to other schools in the state. 76 Between
1989-90 and 1993-94, Project Challenge
school districts’ average ranking on
grade-two test results improved from 97th-
highest to 78th-highest in reading and from
90th-highest to 56th-highest in math. Therefore,
student achievement in these poor districts
in 1993-94 was only a little below that of
the median district in the state in reading and 
above the median in math. 



Key Findings from Analyses of the Tennessee STAR Experiment 

1. On every achievement measure in every year through eighth grade, there were statistically significant differences
between the performance of students in small classes and those in the two types of regular classes. 

2. Every type of district -- inner-city, urban, suburban, and rural -- enjoyed significant gains from small classes. 

3. In each grade, minorities and students attending inner-city schools enjoyed greater small-class advantages than whites 
on some or all measures. 

4. The same benefits from small classes were found for boys and girls alike. 

5. Rural small classes achieved the highest test scores. 

6. For students who spent all four years (K-3) in small classes, the average achievement advantage on math, reading, 
and science tests grows from 6-9 months of schooling in grade four to more than one year of schooling in grade eight. 

7. Students who attended small classes took college-entrance exams at significantly higher rates than students who 
attended the two types of regular classes. 

8. Students who attended small classes graduated from high school on schedule at significantly higher rates than 
students who attended the two types of regular classes. 

Wisconsin: Student Achievement
Guarantee in Education (SAGE) 

Wisconsin implemented its statewide 
Student Achievement Guarantee in Education 
(SAGE) program in 1996-97. SAGE seeks to 
increase the academic achievement of children 
living in poverty by reducing the student- 
teacher ratio in kindergarten through third 
grade to 15:1. 77 Participation in SAGE re- 
quires a school to implement a rigorous 
academic curriculum, provide before- and 
after-school activities for students and com- 
munity members, and implement professional 
development and accountability plans. 

All districts with a school that enrolls 
50 percent or more low-income children 
participated. Within these districts, any school 
enrolling 30 percent or more low-income 
children could apply. Each eligible district 
except Milwaukee could designate one school 
as a SAGE school. Milwaukee was allowed 10 
SAGE schools. 

Schools entering the program had to 
agree to remain in SAGE for its five-year 
duration. They also had to submit an annual 
“Achievement Guarantee Contract” to the 
state Department of Public Instruction. This 
contract explains how the school plans to 
implement the SAGE program requirements. 
Schools are allowed wide latitude in develop- 
ing their plans. Upon accepting a school into 
SAGE, the state provides up to an additional 
$2,000 per low-income student enrolled in 
SAGE classrooms. The original legislation 
specified that no new schools would be admitted
after the start of the 1996-97 school year.
However, SAGE proved so popular that the
state legislature agreed to expand it beginning 
with the 1998-99 school year. 

SAGE is designed to be implemented 
in stages. Kindergarten and first-grade classes 
entered the program in 1996-97, second grade 
was added in 1997-98, and third grade in 
1998-99. All classrooms at the appropriate 
grade level in participating schools must have 
a student-teacher ratio of no more than 15:1. 
During the 1996-97 school year, SAGE was 
implemented in 30 schools in 21 school 
districts throughout Wisconsin. 

The legislation creating SAGE requires 
an annual evaluation of the program and a 
fifth-year final report on the impact of the 
program on academic achievement. Alex 
Molnar and co-researchers at the School of 
Education at the University of Wisconsin- 
Milwaukee are conducting this legislatively 
mandated evaluation. 78 SAGE schools are 
being compared to a group of 14-17 non- 
SAGE schools (the exact number depending 
on the year) in SAGE districts. Students are 
tested in reading, language arts, and math on 
the Comprehensive Test of Basic Skills 
(CTBS) Complete Battery, Terra Nova edition. 

Comparison schools were selected for 
their similarity to one or more individual 
SAGE schools in demographic composition, 
school size, initial third-grade test scores, and 
percentage of low-income students. In addi- 
tion to quantitative analysis, the SAGE re- 
search plan contains extensive qualitative 
research, including interviews of teachers and 
principals, surveys of teachers, examination of 
teacher logs, and classroom observation. 

Why Are Small Classes So Effective?

The STAR, SAGE, and other studies reviewed in this report suggest that small classes promote higher achievement for 
several mutually reinforcing reasons. 

• Children receive more individualized instruction: one-on-one help, small-group help, class participation. 

• Children misbehave less because of the family atmosphere and quick intervention by teachers. 

• Teachers spend more time on direct instruction and less on classroom management. 

• Classes include more “hands-on” activities, although most instruction remains teacher-centered rather than student-centered.

• Students become more actively engaged in learning than peers in large classes. 

• Teachers of small classes “burn out” less often.



The SAGE evaluation established a
baseline measure of performance for participating
students by testing first-graders in the
fall and the spring (beginning in 1996-97). 
Second-graders (beginning in 1997-98) and 
third-graders (beginning in 1998-99) are tested 
in the spring. The SAGE evaluation will track 
through third grade students who were first 
graders in the program in 1996-97, 1997-98, 
and 1998-99. In any given first-grade year, the 
number of SAGE students with valid test 
scores (1300) is somewhat smaller than in 
Tennessee’s STAR experiment. The control 
group of 850 students is substantially smaller 
than the combined regular-class and regular- 
class-with-aide groups in STAR (4,000 stu- 
dents). However, over the three first-grade 
classes as a whole, the SAGE small-classes 
group with valid test scores is expected to 
include about 4,000 students and the compari- 
son group about 2,500. 

Thus far, the SAGE evaluation has 
published reports for the 1996-97 school year 
and the 1997-98 school year. The results
appear consistent with those reported for the
Tennessee STAR experiment. (Precise comparisons
must await parallel application of
similar research methods to the two data sets.)

• In 1996-97 and again in 1997-98, students 
in SAGE first-grade classrooms scored 
significantly higher in all areas tested. The 
first-grade effect sizes are in the range of 
0.1 to 0.3, depending on the statistical 
method used. 

• From spring 1997 to spring 1998, second- 
grade SAGE students’ scores increased 
more than those of comparison-school 
students but not by statistically significant 
amounts (at the .05 level). Over the two 
years taken together, SAGE second- 
graders showed statistically significant 
gains in language arts, mathematics, and 
total score, but not in reading. 

• The achievement benefit of SAGE small 
classes is especially strong for African- 
American students. In 1997-98, for 
example, African-American students in 
SAGE classes increased their average total 
score by 52 points compared to 33 points 
for African-Americans in comparison 
schools. For whites, SAGE school first-
grade test scores increased by 46 points
compared to 41 in comparison schools. 
Thus, African-American SAGE first-grade 
students closed the “achievement gap” 
with white students over the course of the 
school year. However, the gap widened 
substantially in comparison schools. 

• In 1997-98, there was no significant 
difference between student achievement in 
SAGE first-grade classes with two teach- 
ers and up to 30 students and student 
achievement in classes with one teacher 
and up to 15 students. If sustained in 
subsequent evaluations, this finding would 
have considerable significance for policy 
and practice. By adding teachers to larger 
classes, school districts that lack the 
resources to build new classrooms could 
reap the benefits of small classes. 

• Analyses of qualitative data suggest that 
teachers in SAGE classrooms have greater 
knowledge of each of their students, spend 
less time managing their classes, have 
more time for instruction, and use more 
individualized instruction. 
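
The achievement-gap finding reported above for 1997-98 can be checked with simple arithmetic. The sketch below uses only the four score gains quoted in the text; the dictionary layout and function name are illustrative choices. It shows the direction and size of the change in the gap, not whether the gap was eliminated, since starting scores are not given here.

```python
# Average total-score gains in 1997-98, as reported in the text.
GAINS = {
    "SAGE": {"african_american": 52, "white": 46},
    "comparison": {"african_american": 33, "white": 41},
}


def gap_change(gains):
    """Change in the white-minus-African-American score gap over the year.

    Negative means the gap narrowed; positive means it widened.
    """
    return gains["white"] - gains["african_american"]


for group, gains in GAINS.items():
    change = gap_change(gains)
    direction = "narrowed" if change < 0 else "widened"
    print(f"{group}: gap {direction} by {abs(change)} points")
```

On these figures, the gap narrowed by 6 points in SAGE schools and widened by 8 points in comparison schools over the school year.
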

California 

In the 1996-97 school year, California 
appropriated almost $1 billion to implement
an ambitious class-size reduction program. In 
the first year, districts received $650 for each 
student enrolled in a class of no more than 20 
students. The 1997 California budget raised 
the allotment to $800 per student and con- 
tained an additional $1.5 billion for class-size 
reduction. Schools must start by reducing
class size in first grade, then in second grade,
and then in either kindergarten or third grade.
The program’s popularity is illustrated by the 
fact that, by February 1997, 92 percent of all 
first graders and 74 percent of all second 
graders were attending small classes. By 
1997-98, 873 of 895 eligible school districts 
were receiving aid under the program and 
18,400 new classes had been added. 79

Randy Ross, a social scientist working 
for school reform in Los Angeles, sharply 
criticized the California program for doing too 
much, too fast. 80 By implementing class-size 
reduction across the board, he claimed, the 
state exacerbated an existing teacher shortage. 
California’s Legislative Analyst Office made a 
similar criticism: 

The CSR [class size reduction] pro-
gram [resulted] in the hiring of about
18,400 teachers [in 1996-97] ... in 
addition to the approximately 16,000 
elementary teachers that will be hired 
for normal replacement. . . . Twenty- 
four percent of teachers hired for CSR 
are not credentialed and are working 
under an emergency permit or waiver. 
School districts rate teachers hired for 
CSR as being less skilled, on average, 
than teachers hired in previous years. 
At the same time districts are hiring 
less qualified teachers, most are also 
experiencing difficulties in implement- 
ing staff development for those teach- 
ers. 81 

With statewide class-size reduction, 
the best and most qualified teachers had their 
choice of districts in which to work. As some 
of these teachers abandoned inner-city 
schools, these schools hired more teachers 
without credentials. In Los Angeles, two-
thirds of new teachers hired were without
credentials. 82 

The California legislature appropriated 
$1.75 million for a three-year study of the
impact of the class-size reduction program. 
The research will be conducted by a consor- 
tium of research organizations (WestEd, 
PACE, American Institutes for Research, 
RAND, and EdSource). The aim is to encourage
information-sharing and learning by
practitioners as well as to add to the research 
literature. The research design will focus on 
successive cohorts of third- and fourth-graders 
who have and have not attended smaller 
classes. 

Press reports based on test data com- 
piled by the California Department of Educa- 
tion indicate that second- and third-grade 
students in classes of 20 or fewer were scoring 
above the national average in reading and 
math at higher rates than students in larger 
classes. However, these data have not been 
subjected to rigorous analysis. 83 

CONCLUSION



There is no longer any argument about 
whether reducing class size in the primary 
grades increases student achievement. The 
research evidence is quite clear: it does. 

Policymakers considering education 
reforms to improve the achievement of low- 
income children should carefully consider the 
strength of the evidence and the quality of the 
research on smaller classes. In policymaking, 
there is sometimes a tendency to regard all 
studies and research reports as being created 
equal. They are not. As Princeton University 
economist Alan Krueger put it, referring to the 
STAR study, “One well designed experiment
should trump a phalanx of poorly controlled, 
imprecise observational studies based on 
uncertain statistical specifications.” 84

In contrast, the claim that participation 
in a voucher program increases student 
achievement remains weak. The most care- 
fully analyzed voucher program, the Milwau- 
kee Parental Choice Program, included a small 
number of students, many of whom left the 
program each year. Although two out of three 
analyses found positive achievement advan- 
tages in math (but not in reading) for voucher 
students, these results were derived by apply- 
ing complex and sometimes controversial 
analytic methods to weak data. As Cecilia 
Rouse, the most sophisticated researcher to 
analyze the Milwaukee data, pointed out, data 
limitations threaten the validity of any evalua- 
tion of the Milwaukee voucher program. 
Statistical techniques cannot substitute for 
better data. 85

Similar problems bedevil the evalua- 
tion of the Cleveland voucher program. The 
official evaluation of Cleveland found no 
significant differences between voucher and 
public school students in one year and gains 
for voucher students in only one subject, 
language arts, in a second year. Reminiscent 
of the Milwaukee evaluation debates, a team 
of researchers led by Paul Peterson of Harvard 
reexamined the official data, made two contro- 
versial methodological assumptions, and 
pronounced the Cleveland voucher program a 
success. 

Faced with the ambiguity of the exist- 
ing evidence, some may argue that we need 
more voucher experiments. This is one of the 
arguments being used to justify the expansion 
of the private voucher programs described in 
this report. More reliable data may emerge in 
the next several years from some of these 
programs. 

The problem with research on small- 
scale voucher experiments, however, is not 
only the lack of clear performance effects. 
More fundamentally, the problem is that such 
small-scale programs — no matter how crystal 
clear their achievement consequences — can 
tell us little about larger-scale programs. 
Voucher evaluations are less informative than 
class-size research because “vouchers” do not 
represent a specific educational reform. If a 
voucher program generates positive effects, 
the research does not generally look inside the 
schools to ask what explains the success. It 
simply assumes that private is better. 

A second reason that voucher research 
tells education policymakers little relates to 
the issue of scale. As research on private 
schools shows, some private schools appear to 
raise achievement through “peer effects” — 
by placing low-income students with other 
students from more privileged families who
place a high priority on education. (Elite 
private schools also tend to spend large 
amounts of money per student and to have 
smaller classes.) But in a large-scale voucher 
program, peer effects could be quite different 
than in a small-scale program. This may help 
explain why new schools that enroll voucher 
students in Cleveland perform less well than 
public schools while established private 
schools perform better than public schools. 

For these reasons, the only way to find 
out the impact of a large-scale voucher pro- 
gram is to implement one. However, there is 
no strong evidence that this would improve 
achievement. In addition, such a large-scale 
program would likely raise spending on 
students who already attend private schools 
and reduce educational spending on children 
currently in public school. 

IMPLICATIONS FOR PENNSYLVANIA



As the Tennessee and now the Wiscon- 
sin class-size research results have become 
more widely known, reducing class size has 
become a favorite of state and federal legisla- 
tors, as well as parents, across the country. In 
California, small classes have been introduced 
so rapidly and on so large a scale that the 
achievement benefits and the cost-effective- 
ness of reform may be reduced. California’s 
class-size reduction has exacerbated teacher 
shortages and meant that one quarter of teach- 
ers hired to lower class sizes have only emer- 
gency credentials. Low-income areas may 
also have lost experienced teachers as class- 
size reduction created openings in more 
affluent areas. 

Pennsylvania has a rare opportunity to 
introduce a class-size reduction program 
targeted on the areas in which it would gener- 
ate the greatest benefits and designed in a way 
that would generate knowledge of how to 
improve educational achievement in a cost- 
effective manner. Such a SMART (Scientific 
Methods, Achieving Results Today) class- 
size program should begin by reducing class 
size in kindergarten and first grade. As in 
Wisconsin, priority should be placed on 
lowering class size in schools that serve high 
proportions of low-income students. Selective 
introduction of small K-1 classes in the rest of
the state would permit additional scientific 
analysis of the benefits of small classes. 

Pennsylvania should also take a scien- 
tific approach to evaluating the additional 
benefits of small classes in second and third 
grade. Building on Wisconsin’s experience, 
Pennsylvania should evaluate the benefits of 
combining class-size reductions with 
other (e.g., curricular and teacher training) 
innovations. 

As SMART class-size program stu- 
dents progress through higher grades, Pennsyl- 
vania should track social indicators of well- 
being as well as achievement test scores. In 
Wisconsin, the initial interest in smaller 
classes stemmed from their potential social as 
well as achievement benefits. A statewide 
Urban Initiative task force (which included 
bipartisan legislative and business leaders) 
believed that smaller K-3 classes might reduce 
youth violence by increasing the chance that 
children entering school will find an adult who 
knows and cares for them. 

The Tennessee STAR experiment 
represents not just a shining example of 
scientific educational research but also an 
inspiring illustration of politics at its best. The 
demonstration resulted from a compromise 
between legislators who wanted widespread 
class-size reductions and those who consid- 
ered them too expensive given the quality of 
the evidence on their benefits. 

Pennsylvania now has a chance to 
achieve a similarly historic advance. It can 
invest in high-payoff class-size reduction for 
low-income students while conducting sys- 
tematic analysis of what additional invest- 
ments would make sense. A dozen years from 
now, such a program could win for Pennsylva- 
nia the kind of recognition now accorded the 
Tennessee STAR experiment. 